In this paper, we use a partition of the links of a network in order to uncover its community
structure. This approach allows for communities to overlap at nodes, so that nodes may be in 
more than one community. We do this by making a node partition of the line graph of the original 
network. In this way we show that any algorithm which produces a partition of nodes can be used to 
produce a partition of links. We discuss the role of the degree heterogeneity and propose a weighted 
version of the line graph in order to account for this. 



I. INTRODUCTION 



Finding hidden patterns or regularities in data sets is 
a universal problem which has a long tradition in many 
disciplines from computer science [1] to social sciences [2].
For example, when the data set can be represented as a 
graph, i.e. a set of elements and their pairwise relation- 
ships, one often searches for tightly knit sets of nodes, 
usually called communities or modules. The identifica- 
tion of such communities is particularly crucial for large 
network data sets that require new mathematical tools 
and computer algorithms for their interpretation. Most 
community detection methods find a partition of the set 
of nodes where most of the links are concentrated within 
the communities [3, 4]. Here the communities are the
elements of the partition, and so each node is in one and 
only one community. 

A popular class of algorithms seeks to optimise the
modularity $Q$ of the partition of the nodes of a graph
$G$ [5, 6, 7, 8, 9]. The simplest definition of modularity
for an undirected graph, i.e. the adjacency matrix $A$ is
symmetric, is [10]

$$ Q(A) = \frac{1}{W} \sum_{C \in \mathcal{P}} \sum_{i,j \in C} \left[ A_{ij} - \frac{k_i k_j}{W} \right], \tag{1} $$

where $W = \sum_{i,j} A_{ij}$ and $k_i = \sum_j A_{ij}$ is the degree of
node $i$. The indices $i$ and $j$ run over the $N$ nodes of
the graph $G$. The index $C$ runs over the communities of
the partition $\mathcal{P}$. Modularity counts the number of links
between all pairs of nodes belonging to the same com- 
munity, and compares it to the expected number of such 
links for an equivalent random graph in which the degree 
of all nodes has been left unchanged. By construction
$|Q| < 1$, with larger $Q$ indicating that more links remain
within communities than would be expected in the random
model. Uncovering a node partition which optimises
modularity is therefore likely to produce useful commu- 
nities. 
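As a minimal sketch of Eq. (1), the following Python function evaluates the modularity of a given node partition. It is applied to the Bow Tie graph of Fig. 1. The node labels 0 to 4 (with node 2 as the central node), the helper names `adjacency` and `modularity`, and the use of exact fractions are illustrative assumptions, not part of the original method.

```python
from fractions import Fraction

# Bow Tie graph: two triangles sharing node 2 (this labelling is an assumption).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]

def adjacency(n_nodes, edges):
    """Symmetric 0/1 adjacency matrix of a simple undirected graph."""
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A

def modularity(A, communities):
    """Q = (1/W) sum_C sum_{i,j in C} [A_ij - k_i k_j / W], as in Eq. (1)."""
    k = [sum(row) for row in A]          # degrees (or strengths if weighted)
    W = sum(k)                           # W = sum_ij A_ij
    total = Fraction(0)
    for community in communities:
        for i in community:
            for j in community:
                total += A[i][j] - Fraction(k[i]) * k[j] / W
    return total / W

A = adjacency(5, EDGES)
q_nodes = modularity(A, [{0, 1, 2}, {3, 4}])
print(float(q_nodes))  # 0.111..., i.e. exactly 1/9
```

For this partition the value is 1/9, which agrees with the $Q(A) = 0.111$ quoted for the Bow Tie graph in section IV.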

This node partitioning approach has, however, the 
drawback that nodes are attributed to only one commu- 
nity, which may be an undesirable constraint for networks 
made of highly overlapping communities. This would be 
the case, for instance, for social networks, where indi- 
viduals typically belong to different communities, each 




FIG. 1: (Color online) By partitioning the links of a network 
into communities, one may uncover overlapping communities 
for the nodes by noting that a node belongs to the communi- 
ties of its links. In this toy example, a meaningful partition 
consists in dividing the links into two groups (solid blue lines 
and the dashed red lines). In that case, the central node be- 
longs to the two communities because it is at the interface 
between these link communities. 



characterised by a certain type of relation, e.g. friendship, 
family, or work. In scientific collaboration networks (for 
example [11]), authors may belong to different research
groups characterised by different research interests. Such 
inter-community individuals are often of great interest as 
they broker the flow of information between otherwise 
disconnected contacts, thereby connecting people with 
different ideas, interests and perspectives [12, 13].

Only a few alternative approaches have been proposed 
in order to uncover overlapping communities of nodes, for 
example [3, 14, 15]. Our suggestion is to define communities
as a partition of the links rather than of the set of
nodes. A node may then have links belonging to several
communities and in this sense it belongs to several communities.
The central node in a Bow Tie graph is a simple
example, see Fig. 1. This link partition approach should
be especially efficient in situations when the nodes of a 
network are connected by different types of links, i.e. in 
situations where the nodes are heterogeneous while the 
links are very homogeneous. In the case of the social net- 
work mentioned above, this would occur when the friend- 
ship network and work network of individuals only have 
a very small overlap. 

This paper is organised as follows. In section II we
review a definition of modularity which uses the statistical
properties of a dynamical process taking place on the
nodes of a graph. In section III we propose three dynamical
processes taking place on the links of the graph
and derive their corresponding modularities, now defined
for a partition of the links of a network. To do so, we
make connections to the concept of a line graph and with
the projection of bipartite networks. In section IV we
optimise the three modularities for some examples and
interpret our results. In section V we conclude and propose
ways to improve our method.



II. DYNAMICAL FORMULATION OF 
MODULARITY 

To motivate our link partition quality function, let us
first consider how to interpret the usual modularity $Q$
(1) in terms of a random walker moving on the nodes
[17, 18]. Suppose that the density of random walkers on
node $i$ at step $n$ is $p_{i;n}$ and the dynamics is given by

$$ p_{i;n+1} = \sum_j \frac{A_{ij}}{k_j} p_{j;n}. \tag{2} $$



From now on, we will only consider networks that are 
undirected (the adjacency matrix is symmetric), con- 
nected (there exists a path between all pairs of nodes), 
non-bipartite (it is not possible to divide the network into 
two sets of nodes such that there is no link between nodes 
of the same set), and simple (without self-loops or multiple
links). If the first three conditions are respected, it
is easy to show [19] that the stationary solution of the
dynamics is generically given by $p_i^* = k_i/W$.

Let us now consider a node partition $\mathcal{P}$ of the network
and focus on one community $C \in \mathcal{P}$. If the system is at
equilibrium, it is straightforward to show that the probability
that a random walker is in $C$ on two successive time
steps is

$$ \sum_{i,j \in C} \frac{A_{ij}}{k_j} \frac{k_j}{W}, \tag{3} $$

while the probability of finding two independent walkers
at nodes in $C$ is

$$ \sum_{i,j \in C} \frac{k_i k_j}{W^2}. \tag{4} $$

This observation allows us to reinterpret $Q$ as a summation
over the communities of the difference of these two
probabilities. This interpretation suggests natural generalisations
of modularity that allow its resolution to be tuned.
Indeed, $Q$ is based on paths of length one but it can
readily be generalised to paths of arbitrary length as

$$ R(A, n) = \sum_{C \in \mathcal{P}} \sum_{i,j \in C} \left[ (T^n)_{ij} \frac{k_j}{W} - \frac{k_i k_j}{W^2} \right], \tag{5} $$

where $T_{ij} = A_{ij}/k_j$. This quantity is called the stability
of the partition [17]. Because $k_j$ is an eigenvector of




FIG. 2: (Color online) Illustration of the two types of random 
walk considered in this paper. In both cases, the walkers are 
situated on the links of a graph, here starting from the central 
red dashed link. In (A) the "Link-Link random walk" is shown 
where the walker jumps (the green dashed arrows) to any of 
the adjacent links with equal probability. In (B) a "Link- 
Node-Link random walk" is illustrated. In this case the walker 
moves first to a neighbouring node with equal probability, and 
then moves on to a new link, chosen with equal probability 
from those new links incident at the node. 



eigenvalue one of $T$, one can show that the symmetric matrix
$X(n)_{ij} = (T^n)_{ij} k_j$ corresponds to a time-dependent
graph where the degree of node $i$ is always equal to $k_i$.
Therefore $R(A, n)$ can be interpreted as the modularity
of $X(n)_{ij}$, a matrix that connects more and more distant
nodes of the original adjacency matrix $A$ as time $n$
grows [18]. It can be shown that optimising $R(A, n)$ typically
leads to partitions made of larger and larger communities
for increasing times and that the optimal partition when
$n \to \infty$ is made of two communities [17, 18].
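The stability can be evaluated directly from its definition as the difference between the $n$-step return probability and the independent-walker probability. The sketch below assumes a connected simple graph (so that every $k_j > 0$), uses the Bow Tie graph with an illustrative node labelling, and uses a hypothetical helper name `stability`; exact fractions are used only to make the check against modularity exact.

```python
from fractions import Fraction

EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]  # Bow Tie, centre = node 2

def stability(edges, n_nodes, communities, n_steps):
    """R(A, n) = sum_C sum_{i,j in C} [(T^n)_ij k_j / W - k_i k_j / W^2]."""
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    k = [sum(row) for row in A]
    W = sum(k)
    # Column-stochastic transition matrix T_ij = A_ij / k_j.
    T = [[Fraction(A[i][j], k[j]) for j in range(n_nodes)] for i in range(n_nodes)]
    Tn = [[Fraction(int(i == j)) for j in range(n_nodes)] for i in range(n_nodes)]
    for _ in range(n_steps):             # Tn <- T @ Tn, repeated n_steps times
        Tn = [[sum(T[i][m] * Tn[m][j] for m in range(n_nodes))
               for j in range(n_nodes)] for i in range(n_nodes)]
    return sum(Tn[i][j] * Fraction(k[j], W) - Fraction(k[i] * k[j], W * W)
               for c in communities for i in c for j in c)

partition = [{0, 1, 2}, {3, 4}]
r1 = stability(EDGES, 5, partition, 1)   # n = 1 reduces to the modularity Q(A)
r3 = stability(EDGES, 5, partition, 3)   # longer paths, coarser resolution
```

With `n_steps=1` the function returns the ordinary modularity of the partition, and a single community containing every node always has zero stability because $T$ conserves probability.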



III. LINK PARTITION 
A. Random walking the links 

The above discussion suggests that we should look at
a random walker moving on the links of a network in order
to define the quality of a link partition. Such a walker
would therefore be located on the links instead of the
nodes at each time $n$ and move between adjacent links,
i.e. links having one node in common. In the case of the
random walk on the nodes (2), a walker at node $i$ follows
one of its links with probability $1/k_i$, i.e. all links are
treated equally. However, a link between nodes $i$ and $j$
is characterised by two quantities $k_i$ and $k_j$, so a random
walk on the links is more subtle. In the following, we will
focus on two different types of dynamical process that
account differently for the degrees $k_i$ and $k_j$ (see Fig. 2).

In the first process, a walker jumps with the same probability
$1/(k_i + k_j - 2)$ to one of the links leaving $i$ and $j$.
When $k_i \neq k_j$, the walker goes with a different probability
through $i$ or $j$, and we therefore call this process a
"link-link random walk" (see Fig. 2A).

In the second process, a walker jumps to one of the
two nodes to which it is attached, say $i$, then moves to
a link attached to that node (excluding the link it came
from). Thus it will arrive at a link leaving node $i$ with
a probability $1/(2(k_i - 1))$, and similarly it will arrive
at a link attached to the other node $j$ with probability
$1/(2(k_j - 1))$. We will refer to this as a "link-node-link
random walk" (see Fig. 2B). This process is well-defined
unless the link is a leaf, namely one of its extremities has
a degree one, say $i$. In that case, the walker will jump
with a probability $1/(k_j - 1)$ to one of the links leaving
$j$.

These two types of dynamics are different in general 
except if the degrees at the extremities i and j of each 
link are equal. In the case of a connected graph, this con- 
dition is equivalent to demanding that the graph is reg- 
ular, i.e. the degree of all the nodes is a constant. When 
this condition is not respected, the link-link random walk 
favours the passage of the walker through the extremity 
having the largest degree. The difference between the two 
processes will be maximal when the network is strongly 
disassortative, namely when links typically relate nodes 
with very different degrees [20].
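The difference between the two walks can be made concrete by listing the one-step transition probabilities from a single link. The sketch below is illustrative only: the function names are hypothetical, the graph is assumed to be simple and connected, and the Bow Tie graph is labelled so that node 2 is the central node.

```python
from fractions import Fraction

EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]  # Bow Tie, centre = node 2

def degrees(edges):
    k = {}
    for i, j in edges:
        k[i] = k.get(i, 0) + 1
        k[j] = k.get(j, 0) + 1
    return k

def link_link_step(edges, link):
    """Jump to any adjacent link with equal probability 1/(k_i + k_j - 2)."""
    k = degrees(edges)
    i, j = link
    neighbours = [e for e in edges if e != link and (i in e or j in e)]
    return {e: Fraction(1, k[i] + k[j] - 2) for e in neighbours}

def link_node_link_step(edges, link):
    """Pick an end node uniformly, then one of its other links uniformly."""
    k = degrees(edges)
    ends = [v for v in link if k[v] > 1]     # a degree-one end leads nowhere (leaf case)
    probs = {}
    for v in ends:
        for e in edges:
            if e != link and v in e:
                # In a simple graph an adjacent link shares exactly one end node.
                probs[e] = Fraction(1, len(ends) * (k[v] - 1))
    return probs

ll = link_link_step(EDGES, (0, 2))           # every neighbour gets 1/4
lnl = link_node_link_step(EDGES, (0, 2))     # 1/2 via node 0, 1/6 each via node 2
```

Starting from the link (0, 2), the link-link walk sends three quarters of the probability through the high-degree central node, whereas the link-node-link walk splits it evenly between the two end nodes.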



B. Projecting the incidence matrix 

1. Bipartite structure 

In order to study these two types of random walk more
carefully, it is useful to represent a network $G$ by its incidence
matrix $B$. The elements $B_{i\alpha}$ of this $N \times L$ matrix
($L$ is the number of links) are equal to 1 if link $\alpha$ is related
to node $i$ and 0 otherwise. The incidence matrix
of $G$ may be seen as the adjacency matrix of a bipartite
network, $I(G)$ (see Fig. 3B), the incidence graph [footnote 1] of $G$
where the two types of nodes correspond to the nodes
and the links of the original graph $G$. By construction,
all the information of the graph is incorporated in $B$. For
instance, the degree $k_i$ of a node $i$ and the number of
nodes $k_\alpha$ attached to a link $\alpha$ (always equal to two) are
given by

$$ k_i = \sum_\alpha B_{i\alpha}, \qquad k_\alpha = \sum_i B_{i\alpha}. \tag{6} $$



1 An incidence graph is usually defined in terms of the incidence
of a set of lines with a set of points in a Euclidean space of finite
dimension. Here we have a special case where we embed our
graph $G$ in some Euclidean space of no particular interest, and
each link of $G$ is a line which always intersects with exactly two
points.



The $N \times N$ adjacency matrix $A$ of the graph $G$ can also
be obtained from $B$:

$$ A_{ij} = \sum_\alpha B_{i\alpha} B_{j\alpha} - k_i \delta_{ij}. \tag{7} $$

This operation (7) can be interpreted as a projection of
the bipartite incidence graph $I(G)$ onto the unipartite
network $G$ [21, 22]. In a similar way, an adjacency matrix
for the links can be obtained by projecting the bipartite 
network onto its links. In the following, we will focus on 
two standard types of projection that, as we will show, 
are directly related to the two random walks introduced 
above. 
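The relations between $B$, the degrees and $A$ can be checked on a small example. The following sketch builds the incidence matrix of the Bow Tie graph (node labels and the helper name `incidence` are assumptions made for illustration) and recovers the degrees of Eq. (6) and the adjacency matrix of Eq. (7).

```python
# Bow Tie graph: two triangles sharing node 2 (this labelling is an assumption).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
N_NODES = 5
L = len(EDGES)

def incidence(n_nodes, edges):
    """N x L matrix with B[i][a] = 1 if link a is attached to node i, else 0."""
    B = [[0] * len(edges) for _ in range(n_nodes)]
    for a, (i, j) in enumerate(edges):
        B[i][a] = B[j][a] = 1
    return B

B = incidence(N_NODES, EDGES)
k_node = [sum(B[i]) for i in range(N_NODES)]                        # k_i, row sums
k_link = [sum(B[i][a] for i in range(N_NODES)) for a in range(L)]   # k_alpha, column sums

# Eq. (7): A_ij = sum_alpha B_{i alpha} B_{j alpha} - k_i delta_ij
A = [[sum(B[i][a] * B[j][a] for a in range(L)) - (k_node[i] if i == j else 0)
      for j in range(N_NODES)] for i in range(N_NODES)]
```

The subtraction of $k_i \delta_{ij}$ removes the diagonal entries that the product $B B^T$ would otherwise contain, so the projected graph has no self-loops.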



2. Line graph 

The simplest way to project a bipartite graph consists 
of taking all the nodes of one type for the nodes of the 
projected graph. A link is added between two nodes in 
this projected graph if these two nodes had at least one 
node of the other type in common in the original bipartite 
graph. The operation (7) is of this type. When applied
to the links $\alpha$ of the graph $G$, the second type of vertex in
the bipartite incidence graph $I(G)$, it leads to the $L \times L$
adjacency matrix $C$ whose elements are

$$ C_{\alpha\beta} = \sum_i B_{i\alpha} B_{i\beta} (1 - \delta_{\alpha\beta}). \tag{8} $$

It is easy to verify that this adjacency matrix is symmetric
and that its elements are equal to 1 if two links have
one node in common, and zero otherwise. It is interesting
to note that this adjacency matrix corresponds to another
well known graph, usually called the line graph of $G$ [23]
and denoted by $L(G)$ (see Fig. 3C). It is a simple graph
with $L$ nodes. By construction, each node $i$ of degree $k_i$ of
the original graph $G$ corresponds to a fully connected
clique of $k_i$ nodes in $L(G)$. Thus $L(G)$ has
$\sum_i k_i (k_i - 1)/2 = O(\langle k^2 \rangle N)$
links. Line graphs have been studied extensively and
among their well-known properties, Whitney's uniqueness
theorem states that the structure of $G$ can be recovered
completely from its line graph $L(G)$, for any graph
other than a triangle or a star network of four nodes [24].
This result implies that projecting the incidence matrix 
onto L(G) does not lead to any loss of information from 
the network structure. This is a remarkable result that 
is not generally true when projecting generic bipartite 
networks. 

It is now straightforward to express the dynamics of
the link-link random walk (Fig. 2A) in terms of the projected
adjacency matrix $C$

$$ p_{\alpha;n+1} = \sum_\beta \frac{C_{\alpha\beta}}{k_\beta} p_{\beta;n}. \tag{9} $$

Now $p_{\alpha;n}$ is the density of random walkers on link $\alpha$
at step $n$, $k_\alpha = \sum_\beta C_{\alpha\beta} = (k_i + k_j - 2)$ and where $i$





FIG. 3: (Color online) The information of the Bow Tie graph
in (A), as encoded by the adjacency matrix $A$, has
other equivalent graph representations. In (B) the incidence
matrix ($B$ of Eqn. (7)) of the Bow Tie is shown as a bipartite
network, the incidence graph $I(G)$. The line graph of the Bow
Tie, $L(G)$, is the unweighted version of the graph labelled
(C,D), with adjacency matrix $C$ of Eqn. (8). The weighted
version in diagram (C,D) has an adjacency matrix $D$ of Eqn.
(11). The weighted line graph with self loops, labelled (E),
has an adjacency matrix $E$ of Eqn. (14). Circles represent
entities which correspond to nodes of the original graph, while
triangles come from links in the original graph.



and $j$ are the extremities of $\alpha$. This dynamical process
therefore only depends on the sum of the degrees of $i$ and
$j$. The stationary solution is found to be $p_\alpha^* = k_\alpha/W$,
where $W = \sum_{\alpha,\beta} C_{\alpha\beta}$. When $G$ is simple, then
$W = \sum_i (k_i - 1) k_i$. By reapplying the steps described in [18],
it is now straightforward to derive a quality function for
the link partition $\mathcal{P}$ of the graph $G$

$$ Q(C) = \frac{1}{W} \sum_{C \in \mathcal{P}} \sum_{\alpha,\beta \in C} \left[ C_{\alpha\beta} - \frac{k_\alpha k_\beta}{W} \right]. \tag{10} $$

This is just the usual modularity (1) for a graph with
adjacency matrix $C$.

As we noted, a single node i in G leads to a connected 
clique of $k_i(k_i - 1)/2$ links in the line graph $L(G)$. This
seems to suggest that the line graph $L(G)$ gives too much
prominence to the high degree nodes of the original graph
$G$. Our response is to define a weighted line graph whose
links are scaled by a factor of $O(1/k_i)$.
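The unweighted line graph and its modularity can be sketched in a few lines. The example below builds $C$ for the Bow Tie graph and evaluates $Q(C)$ for the link partition into two triangles. The helper names, the node labelling and the indexing of links by their position in `EDGES` are illustrative assumptions.

```python
from fractions import Fraction

EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]  # Bow Tie, centre = node 2

def line_graph(edges):
    """C_ab = number of nodes shared by distinct links a and b (0 or 1 if G is simple)."""
    L = len(edges)
    return [[0 if a == b else len(set(edges[a]) & set(edges[b]))
             for b in range(L)] for a in range(L)]

def modularity(M, communities):
    """(1/W) sum_C sum_{a,b in C} [M_ab - k_a k_b / W] for any symmetric matrix M."""
    k = [sum(row) for row in M]
    W = sum(k)
    return sum(M[a][b] - Fraction(k[a]) * k[b] / W
               for c in communities for a in c for b in c) / W

C = line_graph(EDGES)
k_link = [sum(row) for row in C]          # k_alpha = k_i + k_j - 2
W = sum(k_link)                           # equals sum_i k_i (k_i - 1) for a simple graph
link_partition = [{0, 1, 2}, {3, 4, 5}]   # link indices of the two triangles
q_c = modularity(C, link_partition)
print(float(q_c))  # 0.1
```

The four links attached to the central node each have degree 4 in $L(G)$, twice that of the two outer links, which illustrates the extra weight the unweighted line graph gives to high degree nodes.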



3. Weighted line graph 

In order to derive the quality of a link partition associated
to the link-node-link random walk, it is useful
to project the incidence matrix in a different way and to
define another graph $D(G)$ with a symmetric adjacency
matrix given by

$$ D_{\alpha\beta} = \sum_{i, k_i > 1} \frac{B_{i\alpha} B_{i\beta}}{k_i - 1} (1 - \delta_{\alpha\beta}). \tag{11} $$

This weighted line graph has the intuitive property that
the degree $k_\alpha = \sum_\beta D_{\alpha\beta}$ of a link $\alpha$ is equal to two (a link
always has two extremities) unless $\alpha$ is a leaf in $G$ (then
$k_\alpha = 1$ except for one trivial case). For example this
weighted line graph of the Bow Tie network is shown in
Fig. 3D. Only if $G$ is regular will this weighted line graph
$D(G)$ be equivalent (up to an overall scale) to the original
unweighted line graph $L(G)$.

This construction is a well-known method for projecting
bipartite networks. For instance in the case of collaboration
networks [11] the $(k_i - 1)$ normalisation is justified
by the desire that two authors should be less connected
if they wrote a joint paper with many co-authors than a
paper with few authors.

This weighted line graph allows us to write the dynamics
of the link-node-link random walk in a natural
way

$$ p_{\alpha;n+1} = \sum_\beta \frac{D_{\alpha\beta}}{k_\beta} p_{\beta;n}, \tag{12} $$

and, by reusing the above arguments, to define another
quality function for the link partition $\mathcal{P}$ of a graph

$$ Q(D) = \frac{1}{W} \sum_{C \in \mathcal{P}} \sum_{\alpha,\beta \in C} \left[ D_{\alpha\beta} - \frac{k_\alpha k_\beta}{W} \right], \tag{13} $$

where $W = \sum_{\alpha,\beta} D_{\alpha\beta} = 2L - L_{\mathrm{leaf}}$ is twice the number of
links $L$ minus the number of leaves in the original graph
$G$, $L_{\mathrm{leaf}}$. Again, this is the same functional form as the
usual modularity, $Q(A)$ of (1), only the adjacency matrix
has changed.
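A minimal sketch of the weighted line graph $D$ and its modularity is given below, again for the Bow Tie graph. The function names and the labelling of nodes and links are assumptions made for illustration; a two-link path is included to show the leaf case.

```python
from fractions import Fraction

EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]  # Bow Tie, centre = node 2

def weighted_line_graph(edges):
    """D_ab = sum_i B_ia B_ib (1 - delta_ab) / (k_i - 1), as in Eq. (11)."""
    k = {}
    for i, j in edges:
        k[i] = k.get(i, 0) + 1
        k[j] = k.get(j, 0) + 1
    # Two distinct links can only share a node of degree >= 2, so k_i - 1 > 0.
    return [[Fraction(0) if a == b else
             sum((Fraction(1, k[i] - 1) for i in set(ea) & set(eb)), Fraction(0))
             for b, eb in enumerate(edges)] for a, ea in enumerate(edges)]

def modularity(M, communities):
    """(1/W) sum_C sum_{a,b in C} [M_ab - k_a k_b / W]."""
    k = [sum(row) for row in M]
    W = sum(k)
    return sum(M[a][b] - k[a] * k[b] / W
               for c in communities for a in c for b in c) / W

D = weighted_line_graph(EDGES)
strengths = [sum(row) for row in D]            # 2 for every link (no leaves here)
q_d = modularity(D, [{0, 1, 2}, {3, 4, 5}])    # two triangles
path = weighted_line_graph([(0, 1), (1, 2)])   # two leaf links joined at node 1
print(float(q_d))  # 0.2777..., i.e. exactly 5/18
```

Every link of the Bow Tie graph has strength two in $D$, so the links attached to the central node no longer dominate, and $W = 2L = 12$ since there are no leaves.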



C. Projection of a node random walk 

The random walks proposed in the previous sections
have been defined on the line graph, and therefore consist
of walkers moving among adjacent links of the original
graph $G$. However, such processes cannot be related to
the original random walk (2) on the nodes of $G$, because a
walker moving on links can pass at two subsequent steps
through the same node of $G$ while such self-loops are
forbidden in (2). This observation suggests an alternative
approach where the dynamics would be driven by the
original random walk (2) but would be projected on the
links of the network. To do so, let us assume that a walker
has not moved yet and is located at node $i$. In that case,
it is reasonable to assume that all the neighbouring links
of $i$ are connected by a weight $1/k_i$. The corresponding
adjacency matrix $E$ for the links is therefore given by

$$ E_{\alpha\beta} = \sum_{i, k_i > 0} \frac{B_{i\alpha} B_{i\beta}}{k_i}, \tag{14} $$

and is based on an unconstrained unbiased two-step random
walk on the bipartite incidence graph $I(G)$ [footnote 2]. Unlike
our previous line graph constructions, $C$ of (8) and $D$ of
(11), this weighted line graph $E(G)$ has self loops. It is
illustrated for the Bow Tie graph in Fig. 3E. All nodes $\alpha$
in $E(G)$ have strength two, $\sum_\beta E_{\alpha\beta} = 2$, reflecting the
fact that the links in the original graph $G$ all have two
ends.

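$E$ is constructed when a walker is located on a node and
has not moved yet. The motion of the walker according
to (2) generates a new adjacency matrix, $E_1$, defined as

$$ E_{1;\alpha\beta} = \sum_{i,j;\, k_i, k_j > 0} \frac{B_{i\alpha}}{k_i} A_{ij} \frac{B_{j\beta}}{k_j}, \tag{15} $$

where we note that $E_1 = EE - E$. The corresponding
graph is still regular with $k_\alpha = \sum_\beta E_{1;\alpha\beta} = 2$, and it
is again weighted with self-loops. The quality function
associated with this dynamics is simply

$$ Q(E_1) = \frac{1}{W} \sum_{C \in \mathcal{P}} \sum_{\alpha,\beta \in C} \left[ E_{1;\alpha\beta} - \frac{k_\alpha k_\beta}{W} \right], \tag{16} $$

where again $W = 2L$.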
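The matrices $E$ and $E_1$, the identity $E_1 = EE - E$ and the regularity of the resulting graph can all be checked numerically. The sketch below does so for the Bow Tie graph; the variable names and labelling are illustrative assumptions, and the graph is assumed simple so that $A_{ij}$ is 1 exactly when $i$ and $j$ are the two ends of a link.

```python
from fractions import Fraction

EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]  # Bow Tie, centre = node 2
L = len(EDGES)
k = {}
for i, j in EDGES:
    k[i] = k.get(i, 0) + 1
    k[j] = k.get(j, 0) + 1
edge_set = {frozenset(e) for e in EDGES}

# Eq. (14): E_ab = sum_i B_ia B_ib / k_i, self loops included (a == b is allowed).
E = [[sum((Fraction(1, k[i]) for i in set(ea) & set(eb)), Fraction(0))
      for eb in EDGES] for ea in EDGES]

# Eq. (15): E1_ab = sum_{i in a, j in b} A_ij / (k_i k_j).
E1 = [[sum((Fraction(1, k[i] * k[j]) for i in ea for j in eb
            if frozenset((i, j)) in edge_set), Fraction(0))
       for eb in EDGES] for ea in EDGES]

# Matrix product E @ E, used to check the identity E_1 = EE - E.
EE = [[sum(E[a][g] * E[g][b] for g in range(L)) for b in range(L)] for a in range(L)]
identity_holds = all(E1[a][b] == EE[a][b] - E[a][b] for a in range(L) for b in range(L))

strengths = [sum(row) for row in E1]     # expected to be 2 for every link
W = sum(strengths)                       # W = 2L
triangles = [{0, 1, 2}, {3, 4, 5}]       # link indices of the two triangles
q_e1 = sum(E1[a][b] - strengths[a] * strengths[b] / W
           for c in triangles for a in c for b in c) / W
print(float(q_e1))  # 0.1666..., i.e. exactly 1/6
```

Note that the sum defining the quality function runs over all pairs in a community, including $\alpha = \beta$, so the self-loops of $E_1$ contribute to $Q(E_1)$.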

This quality function is particularly interesting be- 
cause it has a simple relationship to the modularity of the 
original graph, $Q(A)$ of (1). To show this let us assign a
weight $V_{\alpha c}$ representing the strength of the membership
of link $\alpha$ in community $c$. Such weights may be defined
and constrained in many ways. For instance, in a link


2 One might also try to argue that since an undirected link is both
incoming and outgoing, we might deem it appropriate to allow $\alpha$
to $\alpha$ transitions in the link-link walk of Fig. 2A. That is we could
define an unweighted line graph with self loops with adjacency
matrix $\tilde{C}_{\alpha\beta} = \sum_i B_{i\alpha} B_{i\beta}$. Since it differs from the standard
unweighted line graph $L(G)$ only by the addition of a self-loop
to every node $\alpha$, this can be interpreted within the scheme of [25]
who add self-loops to control the number and size of communities
found.



partition we have $V_{\alpha c} V_{\alpha d} = \delta_{cd}$ for any $\alpha$, i.e. every link
$\alpha$ belongs to just one community. In order to translate
$V_{\alpha c}$ into a community structure on the nodes, it is natural
to use the incidence matrix, $B$ of (7), and to define
the rectangular matrix $V_{ic}$ through

$$ V_{ic} = \sum_\alpha \frac{B_{i\alpha}}{k_i} V_{\alpha c}. \tag{17} $$

If $V_{\alpha c}$ is a link partition then the projected node community
structure $V_{ic}$ is simply the fraction of the links
incident at node $i$ that are in community $c$. Also if
$\sum_c V_{\alpha c} = 1$ then so is $\sum_c V_{ic} = 1$.

Now using the definition of the adjacency matrix in
(7), we find that the modularity of the original graph $G$
for some node community $V_{ic}$ is

$$ Q(E_1; \{V_{\alpha c}\}) = \frac{1}{W} \sum_{c,d} \sum_{\alpha,\beta} \left[ E_{1;\alpha\beta} - \frac{k_\alpha k_\beta}{W} \right] V_{\alpha c} V_{\beta d} \delta_{cd} \tag{18} $$

$$ = \frac{1}{W} \sum_{c,d} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{W} \right] V_{ic} V_{jd} \delta_{cd} \tag{19} $$

$$ = Q(A; \{V_{ic}\}). \tag{20} $$

Thus finding modularity optimal link partitions of the
line graph with adjacency matrix $E_1$ of (15) is equivalent
to the optimisation of the modularity of the original
graph but with a different constraint on the node community
$V_{ic}$ from that imposed when finding node partitions.


IV. EMPIRICAL ANALYSIS 
A. Methodology 

In the previous sections, we have proposed three quality
functions $Q(C)$, $Q(D)$ and $Q(E_1)$ for the partition
of the links of a network $G$. Each represents a different
dynamical process and therefore explores the structure
of the original graph $G$ in a different way. In order
to tune the resolution of the optimal partitions, it is
straightforward to define the stabilities $R(C,n)$, $R(D,n)$
and $R(E_1,n)$ of the three processes by generalising the
concept of modularity to paths of arbitrary length (see
section II). The optimal partitions of these quality functions
can be found by applying standard modularity optimisation
algorithms to the corresponding line graphs. In
this paper, we have used two different algorithms 0, [1| 
and have verified that both algorithms give consistent 
results. 

As a first check, let us look at the Bow Tie graph of
Fig. 1. The optimisation of the three quality functions
$Q(C)$, $Q(D)$ and $Q(E_1)$ leads to the expected partition into
two triangles, with the values $Q(C) = 0.1$, $Q(D) = 0.278$,
$Q(E_1) = 0.167$. In this case, the central node belongs
equally to the two link communities, which
is a far better way to split the network than a node
partition. The best node partition gives $Q(A) = 0.111$
when three nodes in one triangle form one community
and the remaining two nodes form a second community.

In order to compare node partitions and link parti- 
tions in the following, we will use the idea of a 'boundary 
link' and a 'boundary node'. A boundary link of a node 
partition is one which connects two nodes from different 
communities. We will then define a boundary node of a
link partition to be a node which is connected to links
from more than one link community. Thus the central 
node of the Bow Tie graph is a boundary node. 
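Both definitions translate directly into code. In the sketch below the function names, the community labels and the node numbering of the Bow Tie graph are hypothetical choices made for illustration.

```python
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]  # Bow Tie, centre = node 2

def boundary_links(edges, node_community):
    """Links whose two end nodes lie in different node communities."""
    return [(i, j) for i, j in edges if node_community[i] != node_community[j]]

def boundary_nodes(edges, link_community):
    """Nodes attached to links from more than one link community."""
    seen = {}
    for (i, j), c in zip(edges, link_community):
        seen.setdefault(i, set()).add(c)
        seen.setdefault(j, set()).add(c)
    return sorted(v for v, cs in seen.items() if len(cs) > 1)

# Node partition: one triangle versus the remaining two nodes.
node_community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
# Link partition: the two triangles, one label per entry of EDGES.
link_community = ["red", "red", "red", "blue", "blue", "blue"]

print(boundary_links(EDGES, node_community))  # [(2, 3), (2, 4)]
print(boundary_nodes(EDGES, link_community))  # [2]
```

Here both boundary links of the node partition are attached to the single boundary node of the link partition, which is the kind of consistency used to compare the two types of partition in the examples that follow.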



B. Karate Club 

A less contrived graph is the Karate club of Zachary [2],
which is made of thirty-four members. Historically the club
split into two distinct factions. It is standard to compare
the partition produced by a community detection method
to the actual split of the club. The node partition having
the largest value of modularity $Q(A) = 0.420$ contains
four communities, but the resolution can be lowered by
optimising the stability $R(A, n)$ for larger values of $n$.
When $n$ is large enough, the optimal partition is always
made of two communities (see Fig. 4), e.g. $R(A, 11) = 0.078$,
that agree with Zachary's partition into "sink"
and "source" communities [2] using the Ford-Fulkerson
binary community algorithm [26].

The link partitions found by optimising $Q(C) = 0.5$,
$Q(D) = 0.53$ and $Q(E_1) = 0.36$ are shown in Fig. 5.
They are respectively made of 4, 7 and 3 communities.
These three partitions are consistent with the historical
two-way split of the network, as the boundary links of
the two-way partition of Fig. 4 are always connected to
a boundary node of a link partition. In general, however,
the three optimal partitions are as different as their
corresponding dynamical processes are. The most striking
difference is observed around node 1. In the node
partition optimising $Q(A)$, this node is connected to several
boundary links and connects the community of nodes
(5,6,7,11,17) to the rest of the network. Such a position
is consistent with the link partitions obtained from $Q(D)$
and $Q(E_1)$, but not with the link partition optimising
$Q(C)$. In this latter case, one observes that node 1 is
rather the focus of one of the link communities on the
left hand side in Fig. 5. This difference originates from
the high degree of node 1 which implies that a link-link
random walk is biased to pass through this node (see
Fig. 2), and therefore heavily connects its adjacent links.
This is a general problem of the unweighted line graph $C$
that gives too much emphasis to high degree nodes (also
noted in [2^|) and therefore tends to produce communities
centred around hubs. Such a problem does not take
place for the weighted line graphs $D$ and $E_1$, and in both
these cases node 1 is a boundary node, part of several
communities. The main difference between the optimal
partitions of $Q(D)$ and $Q(E_1)$ is the number of the communities
in each, as expected because the line graph $E_1$
connects more distant links of the original graph than
$D$. Let us also note that the optimal partition of $Q(E_1)$
resembles very much the one of $Q(A)$, as suggested by
the relation between $Q(E_1)$ and $Q(A)$ derived in section III C.

Before concluding, let us illustrate how longer random
walks can be used to tune the resolution of the link partition.
We focus on the weighted line graph $D$, whose
optimal partition into seven communities is difficult to
compare against the standard two and four community
node partitions of Fig. 4. Let us therefore focus on the
stability $R(D, n)$, which is based on paths of length $n$ of a
random walker on $D$. As expected, larger and larger communities
are uncovered when $n$ is increased and, when $n$
is large enough, we obtain a two way link partition (see
Fig. 6) that shows a perfect match with the node partition
shown in Fig. 4.



C. Word Associations 

As a final example, let us use the University of South 
Florida Free Association Norms data set [28] to create
a simple network [footnote 3] in the manner of [14]. We obtain
a link partition by optimising the modularity for the
weighted line graph $D$ of (11) but where the null model
term $(k_\alpha k_\beta)/W^2$ has been scaled by a factor of 10.0 in
order to control the resolution Q and in this case obtain 
321 communities in the whole network. The correspond- 
ing quality function can be seen as a linear approximation 
of the stability $R(D,n)$ [18]. It is easier to optimise for
large networks. 
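The effect of scaling the null model term can be illustrated on a small example. The sketch below introduces a hypothetical parameter `gamma` for the scale factor (the text uses the value 10.0) and applies it to the weighted line graph $D$ of the Bow Tie graph; the function name and labelling are illustrative assumptions.

```python
from fractions import Fraction

EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]  # Bow Tie, centre = node 2
k = {}
for i, j in EDGES:
    k[i] = k.get(i, 0) + 1
    k[j] = k.get(j, 0) + 1

# Weighted line graph D of Eq. (11).
D = [[Fraction(0) if ea == eb else
      sum((Fraction(1, k[i] - 1) for i in set(ea) & set(eb)), Fraction(0))
      for eb in EDGES] for ea in EDGES]

def scaled_modularity(M, communities, gamma=1):
    """sum_C sum_{a,b in C} [M_ab / W - gamma * k_a k_b / W^2]."""
    s = [sum(row) for row in M]
    W = sum(s)
    return sum(M[a][b] / W - gamma * s[a] * s[b] / (W * W)
               for c in communities for a in c for b in c)

triangles = [{0, 1, 2}, {3, 4, 5}]
singles = [{a} for a in range(len(EDGES))]
q_default = scaled_modularity(D, triangles)              # gamma = 1 recovers Q(D)
q_tri_10 = scaled_modularity(D, triangles, gamma=10)
q_single_10 = scaled_modularity(D, singles, gamma=10)
```

With `gamma=10` the partition into single links scores higher than the two triangles on this tiny graph, which illustrates how a larger scale factor favours more, smaller communities.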

In Fig. 7 we show part of the network near the word
'bright' which is part of eleven communities [footnote 4]. The topology
of our communities is much less constrained than
those of k-clique percolation [14] which means we can
pick out a wider range of structures. There are some tight 
clique-like subsets, e.g. the names of the planets. At the 
other extreme the method finds more tree like structures 
such as the sequence 'lit-on-switch-lever-handle' which is 
the backbone of another community linked to bright. On 
the other hand this flexibility in the structure can pro- 
duce a confusing picture since many words are members 
of several communities though mostly having just one 
or two links per community. For instance for the word 



3 We take the sum of the two forward strengths of all pairs of 
normed words and add a link only if the total is greater than
0.025. We end up with 5018 words connected by 58536 links and 
from this a line graph with 1266910 links is created. 

4 The eleven communities which contain 'bright' are well char- 
acterised by the following subsets of words:- ('brave', 'bold', 
'daring'), ('bright', 'light', 'sunshine'), ('gone', 'fade', 'dim'), 
('power', 'electric', 'lightening', 'flash'), ('brain', 'intelligence', 
'brilliant'), ('great', 'wonderful', 'gifted'), ('pen', 'paper', 'high- 
light'), ('handle', 'lit', 'on', 'switch', 'lever'), ('cloudy', 'gray', 
'shiny', 'sunny'), ('space', 'sky', 'moonlight', 'stars'), ('assume', 
'illusion', 'imagination', 'vivid'). However 'bright' has sixteen of 
its twenty nine links in the community containing 'sunshine' and 
'light' with just a single link to eight of its eleven communities. 
FIG. 4: (Color online) Optimal node partitions for the unweighted Karate Club data of Zachary, notation as in [2]. On the
left is the partition into two communities made by Zachary [2] using the Ford-Fulkerson binary community algorithm [26]. It
is also produced by optimising $R(A, 11)$ of (5). The right hand figure shows the node partition with optimal $Q(A) = 0.420$ [27]
which contains four communities.




FIG. 5: (Color online) The optimal link partitions of (C) $Q(C)$, (D) $Q(D)$ and (E) $Q(E_1)$ for the Karate Club. They contain 4,
7 and 3 communities respectively. The two smallest communities in the centre of (D) consist of the links: (a) {(3,10), (10,34)}, 
(b) {(34,20), (1,20), (2,20)}. 
FIG. 6: (Color online) Optimal link partition into two communities
of the stability $R(D, 10)$ of the Karate club.




FIG. 7: (Color online) The simple graph created from the
South Florida Free Association Norms data [28], in the manner
of [14]. The link partition shown is produced by optimising
a modified version of the modularity $Q(D)$ where the null
model factor was $10.0 \times (k_\alpha k_\beta)/W^2$. This controls the number
of communities found @. The subgraph shown contains
the word 'bright' along with nodes which have at least 90%
of their links in one of the communities connected to 'bright'.

'bright', it is linked to eight of its eleven communities by 
just one link. However one can exploit this feature to 
start to define strength of membership in different com- 
munities. For instance for visualisation, we have found 
it useful to view only those words which have a large 
number of links within one community, as in Fig. 7.

V. DISCUSSION 

When describing a network, there seems to be a natural
tendency to put the emphasis on its nodes whereas
a graph is both a set of nodes and a set of links. It is
therefore not surprising that node partitioning has been 



studied extensively in recent years while link partition- 
ing has been overlooked so far. In this paper, we have 
shown that the quality of a link partition can be evalu- 
ated by the modularity of its corresponding line graph. 
We have highlighted that optimising the modularity of 
some of our weighted line graphs uncovers meaningful 
link partitions. Our approach has several advantages. A 
key criticism of the popular node partitioning methods is 
that a node must be in one single community whereas it 
is often more appropriate to attribute a node to several 
different communities. Link partitioning overcomes this 
limitation in a natural way. Moreover, the equivalence 
of a link partition of a graph G with the node partition- 
ing of the corresponding line graph L(G) means that one 
can use existing node partitioning code with only the 
expense of producing a line graph transformation and 
an $O(\langle k^2 \rangle / \langle k \rangle)$ increase in memory to accommodate the
larger line graph. Even the memory cost can be reduced
to be $O(1)$ since we have shown our link partitioning
is equivalent to a process occurring on the links of the 
original graph G, so a line graph need not be produced 
explicitly. 
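The memory estimate can be checked on a small example. The sketch below builds the links of the unweighted line graph explicitly, assuming a simple graph (no self-loops or multiedges) given as an edge list; the function names are hypothetical. Each node of degree k contributes k(k-1)/2 links to the line graph, so the number of links grows by a factor of $\langle k^2 \rangle / \langle k \rangle - 1$ relative to the original graph.

```python
from collections import Counter, defaultdict
from itertools import combinations


def line_graph_links(edges):
    """Links of the unweighted line graph of a simple graph.

    Nodes of the line graph are indices into `edges`; two of them are
    linked when the corresponding links of G share an endpoint.
    """
    incident = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        incident[u].append(idx)
        incident[v].append(idx)
    links = set()
    for idxs in incident.values():
        links.update(combinations(idxs, 2))
    return links


def degree_moment_ratio(edges):
    """Return <k^2>/<k> for the graph defined by `edges`."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(k * k for k in degree.values()) / sum(degree.values())


# A star with four leaves: its line graph is the complete graph on 4 nodes.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
lg = line_graph_links(star)
growth = len(lg) / len(star)      # links in L(G) per link in G
ratio = degree_moment_ratio(star)  # <k^2>/<k>
```

The star illustrates why degree heterogeneity matters: a single hub of degree k alone generates k(k-1)/2 links in the line graph.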

Our method can be seen as a generalisation of the pop- 
ular k-clique percolation [3], which finds sets of con- 
nected k-cliques. By way of comparison we find collec- 
tions of two-cliques which are more densely connected 
than expected in an equivalent null model. Thus the link 
partitioning of our paper can be seen as an extension of 
two-clique percolation that allows for the uncovering of 
finer modules, whereas two-clique percolation trivially uncovers
only connected components. An interesting generalisation
would be to apply our approach to the case of triangles, 
4-cliques, etc. To do so, one has to replace the incidence 
matrix (relating nodes and links) by a more general bi- 
partite graph, representing the membership of nodes in 
a clique of interest. Our random walk analysis in terms 
of this bipartite graph would then proceed in the same 
fashion, and should allow one to uncover finer modules than
those obtained by k-clique percolation. 
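As a rough illustration of this generalisation, the sketch below takes triangles as the cliques of interest, records which triangles each node belongs to (the bipartite membership structure), and projects it onto the triangles. The projection weight used here, the number of shared nodes, is a simplifying assumption made for illustration; it is not the weighting proposed in this paper, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def triangles(edges):
    """Sorted list of triangles (as sorted node triples) in a simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    found = set()
    for u, v in edges:
        for w in adj[u] & adj[v]:
            found.add(tuple(sorted((u, v, w))))
    return sorted(found)


def clique_graph(cliques):
    """Project node-clique membership onto the cliques.

    Returns {(a, b): number of nodes shared by cliques a and b}, with
    a < b indexing into `cliques`. Assumed weighting, for illustration.
    """
    member = defaultdict(list)  # node -> indices of cliques containing it
    for idx, clique in enumerate(cliques):
        for node in clique:
            member[node].append(idx)
    weights = defaultdict(int)
    for idxs in member.values():
        for a, b in combinations(idxs, 2):
            weights[(a, b)] += 1
    return dict(weights)


# Bow tie: two triangles sharing node 2.
bow_tie = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
tris = triangles(bow_tie)
projected = clique_graph(tris)
```

Passing the links themselves as the list of cliques recovers the unweighted line graph of a simple graph, so any node partitioning code can then be run on the projected graph in the same way.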

All our expressions also hold for the case of weighted 
networks. Even multiedges can be accommodated if we
start from the incidence matrix, B. However, the beauty
of our approach is that any type of graph analysis, be it 
community detection or something else, can be applied 
to a line graph rather than the original graph. In this 
way, one can view a network from a completely different 
angle yet use well established techniques to obtain fresh 
information about its structure. 



Acknowledgments 

R.L. would like to thank M. Barahona and V. Eguiluz 
for interesting discussions, and UK EPSRC for support. 
After this work was finished, we received the paper of 
Ahn et al. [29], who also look at edge partitions but not
in terms of weighted line graphs. 






[1] M. Fiedler, Czechoslovak Mathematical Journal 25, 619-633 (1975).
[2] W. Zachary, Journal of Anthropological Research 33, 452 (1977).
[3] S. Fortunato and C. Castellano, in Encyclopedia of Complexity and System Science, edited by R.A. Meyers (Springer-Verlag, 2009).
[4] M.A. Porter, J.-P. Onnela and P.J. Mucha, Communities in Networks, arXiv:0902.3788.
[5] M.E.J. Newman, Phys. Rev. E 69, 066133 (2004).
[6] R. Guimera, M. Sales-Pardo and L.A.N. Amaral, Phys. Rev. E 70, 025101(R) (2004).
[7] V.D. Blondel, J.-L. Guillaume, R. Lambiotte and E. Lefebvre, J. Stat. Mech., P10008 (2008).
[8] A. Noack and R. Rotta, Lecture Notes in Computer Science 5526, 257-268 (2009).
[9] J. Reichardt and S. Bornholdt, Phys. Rev. E 74, 016110 (2006).
[10] M. Girvan and M.E.J. Newman, Proc. Natl. Acad. Sci. USA 99, 7821 (2002).
[11] M.E.J. Newman, Phys. Rev. E 64, 016131 (2001).
[12] R.S. Burt, American Journal of Sociology 110, 349-399 (2004).
[13] R. Lambiotte and P. Panzarasa, Journal of Informetrics 3, 180-190 (2009).
[14] G. Palla, I. Derenyi, I. Farkas and T. Vicsek, Nature 435, 814 (2005).
[15] V. Nicosia, G. Mangioni, V. Carchiolo and M. Malgeri, J. Stat. Mech. P03024 (2009).
[16] A. Lancichinetti, S. Fortunato and J. Kertesz, New J. Phys. 11, 033015 (2009).
[17] J.-C. Delvenne, S. Yaliraki and M. Barahona, arXiv:0812.1811.
[18] R. Lambiotte, J.-C. Delvenne and M. Barahona, arXiv:0812.1770.
[19] F.R.K. Chung, Spectral Graph Theory, CBMS Regional Conference Series in Mathematics.
[20] M.E.J. Newman, Phys. Rev. Lett. 89, 208701 (2002).
[21] T. Zhou, J. Ren, M. Medo and Y.-C. Zhang, Phys. Rev. E 76, 046115 (2007).
[22] R. Lambiotte and M. Ausloos, Phys. Rev. E 72, 066107 (2005).
[23] V.K. Balakrishnan, Schaum's Outline of Graph Theory (McGraw-Hill Publ. Comp., New York, 1997).
[24] H. Whitney, American Journal of Mathematics 54, 150 (1932).
[25] A. Arenas, A. Fernandez and S. Gomez, New Journal of Physics 10, 053039 (2008).
[26] L.R. Ford and D.R. Fulkerson, Canadian Journal of Mathematics 8, 399-404 (1956).
[27] G. Agarwal and D. Kempe, Eur. Phys. J. B 66, 409-418 (2008).
[28] D.L. Nelson, C.L. McEvoy and T.A. Schreiber, The University of South Florida word association, rhyme, and word fragment norms, http://www.usf.edu/FreeAssociation/.
[29] Y.-Y. Ahn, J.P. Bagrow and S. Lehmann, arXiv:0903.3178.
